IN THE PAST DECADE OR SO, LINGUISTIC SCIENCE has paid increasing
attention to the communicative (or message transmission) aspect of
language, as opposed to an earlier almost exclusive emphasis on
grammatical forms and the ways in which they could be combined to form
grammatical sentences. This shift in our understanding of what it is
that native speakers of a language “know” and “can do” has led to
widespread rethinking about what the content and organization of
second-language courses should be. More recently, attention has
increasingly been given to how the communicative ability of
second-language speakers can be tested. Some excellent discussions of
this question now exist in the literature.1 This article is an attempt
to synthesize for the informed reader some of the main principles as
well as some of the unresolved issues which have emerged from these
investigations of communicative aspects of language, and to describe
several promising approaches to developing “communicative” tests.


WHAT IT IS TO “KNOW” A LANGUAGE: LINGUISTIC 
AND COMMUNICATIVE COMPETENCE 


Chomsky’s early hypotheses about the nature of language knowledge were
influential in focusing the interests of linguists on formulating the
linguistic rules which could describe or “generate” grammatical
sentences. “Competence” was viewed as the native speaker's internalized
grammar, or regularities underlying his use of the linguistic code
(e.g., phonology, lexicon, and syntax). Since competence cannot be
directly observed, its imperfect realization in concrete situations, or
“performance,” although of secondary interest to linguists, was
important for the information it could provide about competence.
Subsequent research has convincingly demonstrated that the native
speaker has, in addition to a linguistic code grammar, other
higher-order internalized rule systems which are also important in
determining his or her language behaviour.2
These include syntactic and [...] at the discourse level which clarify
propositional content (reflected in what Widdowson calls the “cohesion”
of the discourse). They also include underlying regularities in the
ordering of the language functions expressed in discourse, which are
[...] determined by factors such as speaker intention, the roles and
relationships of participants, and other contextual features. These
regularities in the illocutionary development of discourse comprise
what Widdowson calls its “coherence,” and represent one interface
between the linguistic code and real-world phenomena. Beyond these
discourse phenomena, further extralinguistic constraints on the
linguistic forms of verbal behaviour are apparent. For example,
sociolinguistic considerations of appropriateness, given the
relationship between participants or the formality of a communicative
context, may determine the actual choice and ordering of linguistic
forms above and beyond considerations of propositional and
illocutionary meaning. Non-verbal communicative behaviour is also
intimately linked to verbal behaviour, and thus to the particular forms
of the linguistic code. Canale and Swain have referred to these
regularities in the pragmatic relationships between the linguistic
forms and the extralinguistic context as “rule-governed, universal and
creative aspect [...].”

Thus there are multiple levels of verbal behaviour at which one may
speak of probabilities and constraints, or rule-systems. These include,
at least, grammatical, discourse, and sociolinguistic levels. All of
these levels of competence are involved in the structuring of
communication and contribute to the propositional and social meanings
of utterances. Their psychological representation, as an integrated,
internalized system for making partial predictions about what will come
next in a verbal exchange, has been characterized by Oller as a
“grammar of expectancy.”5 The existence of this grammar of expectancy
becomes apparent when our expectations about what will happen are
violated. We will react with surprise, or may even not understand what
is said to us if its form or content is somehow unexpected. On the
other hand, we often do not even notice mistakes made by a speaker when
we are attending to meaning.6 Meaning does not exist ready-made in the
linguistic code, but is rather a function of the relationships between
language forms, functions, and context, including the intentions of the
speaker and the expectations of the hearer.

In this perspective language competence is viewed as a complex system
of rule sets which operate simultaneously at many levels to determine
the organization of grammatical forms for the fulfillment of
communicative and other language functions. Language competence is not
“additive,” or the sum of discrete sets of syntactic, phonological,
morphological, semantic and discourse-level items and organisational
systems. Rather the whole of language is greater than the sum of the
elements and systems which compose it. Communicative competence (which
includes linguistic competence) refers, then, to an integrated system
of knowledge whose functioning depends at least as much upon the
redundancy between the [...] as upon any particular stock of items and
patterns.

Language testing which does not take into account propositional and
illocutionary development beyond the sentence level, as well as the
relationship between language behaviour (verbal and non-verbal) and
real-world phenomena, is at best getting at only a part of
communicative competence. Small wonder that we often find that a
student's success at second-language classroom exercises and tests
appears to bear little relationship to his or her ability to use the
language effectively in a real-world situation.

Morrow has identified a number of features of language as it is used in
communication which have been generally ignored in the teaching and
testing of second languages.7 Language in communication is, for
example, interaction-based (i.e., occurs between the participants in a
speech act, each of whom dynamically influences the linguistic
behaviour of the other in a multitude of ways). It is to some extent
unpredictable (although it has norms) so that participants must process
new information under [...]. It is always purposive (is intended to
fulfill some communicative [...]), [...] ([...] level of formality,
etc.), and is related to the behaviour of the participants [...]
(discourse) and an extralinguistic context.

The implications of the psycholinguistic model of language competence
and of our understanding of the nature of language in use, as described
above, are profound. If we aim to evaluate the communicative abilities
of second-language learners and speakers, we need to test all levels of
competence simultaneously. In other words, we need to engage the
examinee’s grammar of expectancy. And to do this, the language and the
tasks that we use in our tests [...] in use.


ACQUIRING COMMUNICATIVE COMPETENCE 


Similar considerations hold for second-language acquisition.
Researchers from several perspectives have found evidence that second
languages are most efficiently acquired through use in meaningful,
naturalistic situations, i.e., when language is used for communicative
purposes, when realistic extralinguistic as well as verbal contexts are
present or implied, and when all levels of the learner's language
processing system, or grammar of expectancy, are activated.' As Oller
has expressed it, the brain does not store phonemes and morphemes
according to their category of linguistic analysis, but rather stores
them mapped onto contexts. Human information processing abilities
appear to function in this way, so that the higher the level at which
language is contextualized (i.e., speech sounds contextualized in
morphemes and words, or words in sentences, versus utterances
contextualized in discourse, or discourse in extralinguistic context),
the more effective language perception, processing and acquisition will
be.11

Information-processing theory would at least 


Communicative Testing in 1.2 


partially attribute this phenomenon to the multiple associations which
are simultaneously made with new linguistic material when it is
contextualized at all possible levels. Associations made both within
the linguistic system and with extralinguistic phenomena will reinforce
each other, and will lead to better retention and to multiple access
channels in long-term memory. When second-language learners interact
with native speakers and use authentic written language materials these
associations will better represent native speaker probabilistic rules
for the relationships among forms in an utterance, among utterances in
discourse, and for the communicative functions likely to be expressed
in a given context. (Stevick would, of course, also emphasize the
personal meaningfulness or “depth” of learner involvement in the
communicative act, as important in enhancing memory processes. Such
involvement ensures learner response and more elaborated or “deeper”
processing of the stimulus. It thus leads to better long term retention
and a greater likelihood of access to the new knowledge when a
subsequent context offers appropriate cues.)12

Research, then, suggests that second languages are best acquired as
well as tested through their naturalistic use in context.


CHARACTERISTICS OF LANGUAGE TESTS 


Much work is currently in progress in the 
area of communicative syllabus design. In test- 
ing, however, in spite of recognition of the need 
for communicative tests, recent advances in the 
development of procedures and guidelines for
such tests, and widespread experimentation 
with communicative tests for specific purposes, 
the field of testing communicative competence 
can at best be said to be in an embryonic 
stage.13

Communicative tests share a number of characteristics with other kinds
of language tests. They tend to have one of three categories of
purpose: evaluation of language proficiency (for placement in language
courses, admission to programs, certification, etc.), diagnosis of
particular areas of strength or weakness in language proficiency, and
evaluation of achievement relevant to a particular instructional unit
or program. Tests of all types can also serve to motivate students in
their language learning endeavor (or the contrary), and may also be
used in the evaluation of instructional programs.

All tests are samples of behaviour, intended to reflect whether the
examinee possesses certain knowledge, or to predict whether he or she
can perform certain acts. Tests generally consist of a number of items,
each composed of stimulus material and a related task which requires a
response on the part of the examinee. Responses are then scored
according to certain criteria.

Other important characteristics of a test include its validity (does
the test measure what it is intended to measure?), reliability (will it
function in the same way, i.e., will the score gained approach the
“true score” of the examinee each time it is given, and in a consistent
way with different examinees?), and feasibility (is it too expensive
and time-consuming with respect to development, administration, or
scoring?). (Feasibility is a relative concept based on the time and
resources available, and is a consideration which often causes
compromises with respect to validity and reliability.) Tests should
also be interesting, and should not provoke undue anxiety (in the
interest of validity and reliability, so that the examinee’s best
performance will be measured).
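
The notion of reliability described above can be made concrete with a
small computation. What follows is a minimal sketch and not a procedure
given in this article: it assumes that reliability is estimated by the
test-retest method, as the Pearson correlation between two sittings of
the same test, and the function name `pearson` and all scores are
invented for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores in each list must vary")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for five examinees on two sittings of the same test.
first_sitting = [52, 61, 70, 78, 90]
second_sitting = [55, 60, 72, 75, 93]

# Values near 1.0 suggest the test ranks examinees consistently.
test_retest = pearson(first_sitting, second_sitting)
print(round(test_retest, 2))
```

A coefficient of this kind says nothing about validity: a test can rank
examinees very consistently and still measure something other than what
was intended.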

Davies has discussed three continua along which tests may be
categorized which illuminate several current issues in communicative
testing. These are: discrete point versus integrative, indirect versus
direct, and norm versus criterion referenced.* The first refers to the
degree to which a test emphasizes isolated bits of language knowledge
versus more global abilities. “Discrete point” has traditionally
referred to tests of grammar in which single sentences are the maximum
stimulus unit presented or response required. Such tests also use
separate tasks or subtests to assess the “four skills.” (Some current
writers use the term for all tests in which one element at a time is
emphasized, thus including tests of isolated communicative functions.
However, such an extension of meaning undermines the usefulness of the
distinction between discrete point and integrative tests, since test
tasks could conceivably be both at the same time.) The essential point
is that discrete point tests select particular levels of language code
organization (e.g., phonology, syntax) to test grammatical points
(e.g., speech sound discrimination, correct choice of prepositions),
while generally dealing with only one “skill” (reading, writing,
listening, or speaking). Test stimuli and/or responses can be discrete
point. The higher the level of language organization used in a test,
the more “integrative” it is. Tests at the integrative end of the scale
present language in discourse (rather than at the sentence, or word, or
syllable) level, and activate the expectancy grammar. (Examples include
auditory and written cloze, dictation, interviews, reading and
listening passages in which questions require global comprehension, and
written essays.)
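
To make one of these integrative formats concrete, the sketch below
builds a written cloze passage by fixed-ratio deletion, one common way
of constructing such a test: every n-th word of a connected text is
replaced by a blank for the examinee to restore. This is an
illustration under stated assumptions and not a procedure specified in
this article; the function name `make_cloze`, the deletion ratio, and
the sample passage are all hypothetical.

```python
def make_cloze(text, n=5):
    """Blank out every n-th word; return the gapped passage and the answer key.

    Words are split on whitespace only, so punctuation stays attached
    to the word it follows.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    words = text.split()
    answers = []
    for i in range(n - 1, len(words), n):
        answers.append(words[i])
        words[i] = "_____"
    return " ".join(words), answers


# Hypothetical passage; a real test would use a longer authentic text.
passage = (
    "The caller asked whether the manager could phone her back "
    "before noon today"
)
gapped, key = make_cloze(passage, n=5)
print(gapped)
print(key)
```

Because the deleted words fall wherever the count lands, restoring them
draws on grammar, vocabulary, and discourse context together rather
than on one preselected point.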

Oller distinguishes that subset of integrative tests which also take
into account appropriate extralinguistic context as “pragmatic” tests,
and maintains that only with such tests can we validly measure language
proficiency (i.e., communicative competence and performance).16 Some
authors argue that discrete point tests (in the old sense) have a place
when the purpose of the test is diagnostic,17 but Oller and others
maintain that even for this purpose integrative (or pragmatic) tests
are best; that analysis of the examinee's errors on an integrative task
can tell more about his or her specific language problems than any
highly artificial discrete point task. The implications for
communicative competence testing would seem clear: our test tasks
should be as high on the integrative scale as possible; the tests
should be pragmatic. At the same time, in our analysis of examinee
responses we can look at very specific aspects of performance if that
fits our objectives (i.e., “discrete point” scoring in Morrow's and
Howard's sense).18

The indirect versus direct continuum refers to the degree to which a
test task approaches the actual criterion performance. A direct test at
the extreme end of the scale would involve actually performing the
criterion behaviour in a real-life situation (i.e., a university
student demonstrating adequate second-language mastery by successfully
functioning in a course of study in the host country; a public servant
efficiently taking an office telephone message in the second language).
Something short of direct testing is needed to predict how the examinee
will function in the real-life situation. The tests which we devise for
this purpose are always a simulation of some kind, but there is a range
of artificiality or authenticity possible in a testing situation.
Indirect testing involves using a sample of relatively unrelated
behaviour to predict the examinee’s performance in the eventual target
situation. The Test of English as a Foreign Language (TOEFL) is an
example of a relatively indirect measure.19 A written essay or
true-false grammar test used to qualify tour guides, or an oral
interview used to certify graduate students as being able to do library
research in their second language, are at the extreme indirect end of
the scale with respect to their objectives. Likewise, a written cloze
test used to measure speaking proficiency is an indirect measure.
Indirect measures are often favoured for reasons of practicality. There
may be some cases where results of indirect tests correlate quite
highly with more direct measures of criterion performance. However, in
any given case this must be demonstrated. In communicative testing, it
would seem that our tests should be as direct as possible, and that any
indirect measures must be shown to reliably predict the criterion
performance in real-life language use or at least to have concurrent
validity with more direct measures.

The issue of norm versus criterion referencing refers
to the degree to which an examinee's
performance on a test is judged in terms of the 
performance of others (as in standardized tests 
which yield percentile rankings), versus the 
degree to which it is a measure of his or her 
progress toward a specific objective and level 
of performance (as in a driving test, where a 
minimum score must be achieved regardless of 
the performance of others). Obviously, cri- 
terion referencing is only possible where spe- 
cific objectives exist. The levels required may 
also be based on norm referencing, as an indi- 
cation of what degree of proficiency may be 
reasonably expected. In communicative testing, 
norm referencing is probably appropriate when 
the purpose is to determine placement in a 
multi-level instructional program. But it would 
seem that achievement tests in instructional 
programs where learners need to reach a cer- 
tain level of performance for given objectives, 
or where different learners are working toward 
different objectives (as often happens at ad- 
vanced levels) should give both students and 
instructors feedback on how far the students 
have advanced toward each objective. Test tasks should then ideally be
designed not only to give a “yes” or “no” answer as to whether the
examinee can “do” a task, but also to indicate how well he or she can
do it relative to how well he or she needs to do it. In some programs
it may be appropriate to use both criterion and norm referencing; in
others, one or the other may be most appropriate. Because communicative
testing presumes adequate definition of objectives, based on learner
needs, criterion referencing is a possibility.
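
The contrast between the two kinds of referencing can be illustrated
with a short computation. This is a sketch under stated assumptions and
not a scoring scheme proposed in this article: the function names, the
cohort scores, and the required level are all hypothetical. The same
raw score is reported once relative to other examinees and once
relative to a required level of performance.

```python
def percentile_rank(score, group_scores):
    """Norm-referenced view: percent of the group scoring below this score."""
    if not group_scores:
        raise ValueError("group_scores must not be empty")
    below = sum(1 for s in group_scores if s < score)
    return 100.0 * below / len(group_scores)


def criterion_report(score, required):
    """Criterion-referenced view: pass/fail plus distance from the required level."""
    return {"meets_criterion": score >= required, "margin": score - required}


# Hypothetical data: one cohort, one examinee, one objective.
group = [40, 55, 60, 65, 70, 80, 85, 90]
examinee = 70
required_level = 75  # assumed cut score for this objective

rank = percentile_rank(examinee, group)
report = criterion_report(examinee, required_level)
print(rank, report)
```

The `margin` field reflects the point made above: the report indicates
not only whether the examinee can do the task but how far he or she is
from the level needed.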


SPECIFYING OBJECTIVES FOR COMMUNICATIVE 
LANGUAGE TESTS 

The format, content, and scoring criteria of 
any test should reflect its objectives. In the case 
of a communicative second-language test, these 
objectives should be expressed in terms of what 
the examinee will be able to “do” in the target 
language in a naturalistic situation; i.e.,
whether he or she will be able to use the language
effectively for a given communicative
purpose. We are thus not talking about “be- 
havioural objectives” in the sense that they have 
sometimes been understood in second-language 
programs (e.g., “after hearing a sentence with 
two adverbial phrases, using the present con- 
tinuous or simple present of ‘to be’ and most 
frequent verbs, . . . choose the best answer 
among four choices.”)20

Objectives are more easily formulated when 
the language is being learned for a specific pur- 
pose by persons with certain characteristics 
(i.e., businessmen for routine travel needs in
a particular country; university students for a 
course of study in the second language; immi- 
grant children for integration into a new edu- 
cational and cultural milieu), but it is also 
important to formulate such objectives for general
purpose courses and examinations. These
objectives will then serve as the basis for 
syllabus and test design. 

The first step in determining such objectives is to describe the
learner/examinee's second-language needs. Such needs may be formulated
in terms of the circumstances in which the target language will be
used, if possible in terms of communicative acts (e.g., “read the
sports pages of the newspaper”; “take an office telephone message”).
The problem then is to specify the nature of these acts as precisely as
possible, and to break them into teachable/testable units. As B. J.
Carroll writes, “the needs of any individual aspiring user of a
language derive from, and are specific to, the communicative encounters
he is likely to experience. [...] are to be described in the first
[...] non-language terms.”21 [...] has yet to adequately explain how
[...],
so we are not always certain exactly which variables in communicative
acts are the important ones. Still, it would seem that at least the
following information should be clarified for each communicative act in
the objectives: the purpose of the interaction (including which topics
will be treated, related notions, and the language functions which the
learner will need), situational aspects which will influence language
behaviour (including the social and psychological roles and
relationships of the participants and the settings in which
communicative interaction will take place), and the types of discourse
which will be appropriate (in terms of genre, variety, visual or
auditory channel, etc.). It is also important to determine the degree
of skill expected of the learner or examinee. Such definition of
objectives will make it possible to determine the language forms
(structures, words, and phrases) needed by the second-language speaker,
or at least to specify the kinds of authentic materials and
interactions with native speakers which would expose the learner or
examinee to appropriate forms.
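
The information listed above for each communicative act can be recorded
in a simple structure, which makes it easy to see which aspects of an
objective have not yet been clarified. The sketch below is only one
possible layout, assumed for illustration: the class name, the field
names, and the sample values are hypothetical and are not drawn from
any of the needs-analysis models discussed in this article.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class CommunicativeActObjective:
    act: str                       # e.g. "take an office telephone message"
    topics: List[str]              # purpose: topics and related notions
    language_functions: List[str]  # purpose: functions the learner will need
    participant_roles: str         # situational: roles and relationships
    setting: str                   # situational: where the interaction occurs
    discourse_type: str            # genre or variety of discourse
    channel: str                   # "visual" or "auditory"
    expected_skill: str            # degree of skill expected

    def unspecified(self) -> List[str]:
        """Names of fields left empty, i.e. aspects still to be clarified."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Hypothetical objective with the expected degree of skill left open.
phone_message = CommunicativeActObjective(
    act="take an office telephone message",
    topics=["caller identity", "callback number"],
    language_functions=["asking for repetition", "confirming details"],
    participant_roles="employee and unknown caller",
    setting="office",
    discourse_type="telephone conversation",
    channel="auditory",
    expected_skill="",
)
print(phone_message.unspecified())
```

A flat record of this kind deliberately stays far simpler than a full
needs-analysis model; it only prompts the test writer to state each
aspect before items are written.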

Several detailed models now exist for specifying learner needs and
objectives in the second language, notably those of Richterich, Munby,
and most recently, a model by Carroll for language testing which draws
on both of these.22 These models are helpful in that they indicate the
range of variables that might be considered in describing specific
communicative needs. However, they have certain practical shortcomings.
Munby’s model, in particular, is extremely complex. It may exceed in
its level of detail what it is possible (or even useful) to know about
a learner's or examinee’s future second-language needs. Its
implementation in needs specification is long and laborious as well.
Because the models do not (and probably cannot) specify which variables
are most important in determining language behaviour they may in fact
lead to a distorted description of the “appropriate naturalistic”
language use which is to be taught or tested. There is also the problem
of native-speaker norms versus what might be expected of a
second-language speaker. Communication theory suggests
that as much but only as much information need be included in a
communicative interaction as is essential to make the message clear.
Clearly, native speaker norms, based on shared cultural knowledge and
expectations as well as on high-level mastery of the verbal and other
communicative systems, will not be entirely appropriate even for
fluently bilingual second-language speakers, and will be far beyond the
reach of most second-language speakers. At the same time, native
speakers adjust their speech and expectations when communicating with
non-native speakers, simplifying and slowing their speech, repeating
and pausing, and accepting poor accents and social faux pas. Thus the
objectives should ideally reflect consideration of the level at which
the given non-native learners can reasonably be expected to function in
the second-language setting, as well as of compensatory communicative
strategies (such as inferring from context, paraphrasing, asking for
repetition, using gestures) which would usefully form part of their
second-language communication skills.23 The important role of such
strategies for non-proficient speakers interacting with native speakers
is detailed by Stern in his description of Ontario-Quebec bilingual
exchange visits for secondary school students.24 Since such strategies
can help language learners to communicate in spite of gaps in
grammatical, discourse or sociolinguistic knowledge, they should be
included among the objectives of basic level language programs with a
communicative orientation.25 The measurement of such strategies must
also be a consideration in communicative language testing. (This is
not, however, to say that communicative strategies should have the same
importance in a theoretical characterization of mature native-speaker
communicative competence as they would in a characterization of a
minimal second language “communicative competence” as a goal of
instruction. The term is used with both meanings in the literature,
often without clarification.)

The Council of Europe's “threshold level” and B. J. Carroll's
specifications for different levels of learner proficiency represent
attempts to clarify differences between native-speaker competencies and
appropriate goals for second-language programs and tests. Neither,
however, takes into account the native speaker's adjustment of his or
her language behaviour when interacting with non-natives, nor does
either specify communicative strategies needed by non-proficient
second-language users. Canale and Swain's domain description for
Ontario's core French program is notable for its inclusion of
communicative strategies among instructional goals.26

The current alternative to reliance on needs-analysis models is an
intuitive analysis of learner needs and objectives, by someone familiar
both with the target situations in which learners will have to use the
second language, and with the local requirements of teaching and
testing. (Such an intuitive analysis can, of course, be informed by the
existing theoretical models.)

Once the communicative acts of interest are specified, the use of
authentic target language materials appropriate to the learner's
objectives, and his or her interaction with native speakers in
situations similar to those specified, can shortcut much of the need to
completely describe every aspect of each communicative act to be taught
or tested. To the extent that the classroom can provide experiences of
naturalistic language use (at the learner's level), the learner will be
exposed to vocabulary, syntax, functions, discourse characteristics,
and pragmatic relations which are appropriate to the relevant
communicative acts. To the extent that a test presents authentic
language and communication tasks, with both a verbal (discourse level)
and an extralinguistic context, it will be evoking communicative
performance, and thus approach as nearly as possible the evaluation of
communicative competence.


ISSUES IN COMMUNICATIVE TESTING 


Several problems arise in communicative 
approaches to language testing. One has to do 
with the extent to which performance in one 
situation is generalizable to another situation. 
Does the learner's ability to ask for information 
at the train station tell us anything about his 
or her ability to participate in a social gather- 
ing or read a newspaper? The more specific and 
limited the second-language objectives are, the more precise and
complete can be the performance evaluation. But where communicative
tasks are used to measure global proficiency in courses with general
objectives, it is essential that the underlying communicative rule
systems [...] linguistic code is the most context-free [...] the most
generalizable of communicative abilities, thus justifying relatively
greater attention to this component in general purpose courses. It may
also be possible to define “enabling” or “subsidiary skills” which
underlie many different communicative acts, as the appropriate focus
for general purpose communicative testing.27

The generalizability of test results is related to the larger problem
of establishing appropriate and reliable scoring criteria and
procedures for communicative tests. Tests to determine whether the
learner can function in given target language situations may require a
global evaluation of whether he or she can “do” certain things in the
second language. Or particular components of language behaviour may be
emphasized in the test tasks and scoring criteria, depending upon the
particular needs and backgrounds of the examinees. For example, high
level mastery of grammatical, discourse and sociolinguistic components
will be required for would-be translators and interpreters. These
language components along with communication strategies will be
important for foreign students preparing to study in the second
language. Immigrants entering jobs which require relatively little
verbal interaction with native speakers might reasonably be evaluated
more on their communication strategies, certain sociolinguistic
elements and specific vocabulary knowledge. If the purpose of testing
is diagnostic or to evaluate progress in a language training program,
detailed scoring grids might be in order, whereas global native-speaker
judgements of whether or not the learner has the requisite second
language communication skills might be more appropriate
for placement or entrance requirements. In a second-language training
program in a bilingual area, such as Ottawa, much sociolinguistic
knowledge would already be shared by the second-language learners, so
that evaluation grids (as well as instructional procedures) would
emphasize the linguistic components of communicative competence.
Foreign students, on the other hand, might be evaluated on the
sociolinguistic appropriateness of their performance, as well. Thus,
the initial specification of test objectives, based on a consideration
of both the previous knowledge and the language needs of the learner,
will largely determine the format and content of testing. It will [...]
the criteria by which examinee performance is to be judged. The
construction of scoring [...] to reflect these criteria, and, where
needed, the training of raters to judge performance, present other
sticky problems in communicative test construction for which there are
no definitive answers. Nonetheless, the many communicative tests which
have been developed to date provide models for further experimentation.


CHARACTERISTICS OF COMMUNICATIVE TESTS 


How, then, should we test communicative ability in the second language?

1. We want to tap communicative and not only grammatical competence.
Therefore our tests, to be valid, must activate the internalized rule
systems by which discourse is meaningfully processed, including those
by which sociolinguistic variables influence language behaviour. Such
tests should be integrative, pragmatic tests, involving the use of
naturalistic language in both a verbal and situational context. Since
we wish to test grammatical and discourse-level [...] communicative
competence, we [...] wish to look explicitly at the usage [...] of
language (the correctness of [...] forms and their organization) as
[...] through language in use. Such evaluation, however, will center on
the way the examinee’s responses are scored, rather than on the nature
of the tasks presented. The test tasks and scoring criteria should also
allow evaluation of the appropriateness of the examinee’s responses in
terms of illocutionary development and the sociolinguistic variables
present. At the least, they should require the examinee to recognize
appropriate (versus inappropriate) responses.28

2. We want to test the examinee’s ability to
meet specific target language needs in given
situations, as defined in the objectives of the
test. We want to know whether the learner can
“do” something in the second language with an
acceptable degree of efficiency (including such
considerations as speed of processing and cor-
rectness) as well as appropriateness to the situa-
tion. Ideally, then, our tests should be as direct
as possible. Role-playing, listening to or read-
ing authentic second-language material, and
carrying out realistic tasks in the language are
ways of simulating criterion situations.

3. We wish to test performance in a range
of situations which reflect our objectives; thus
in most cases we will wish to test the examinee’s
manipulation of a variety of language functions.
While oral interview tests of speaking profi-
ciency may admirably evaluate performance of
the functions of narration and information-giv-
ing in a formal, stranger/stranger, unequal-
status communicative situation, and may also
give us information about the quality of the
examinee’s language usage, they will probably
tell us relatively little about his or her ability
to “do” other kinds of things in the target lan-
guage. The range and distribution of situations
covered by our test items, then, should—at
least until we have a coherent theory of the
generalizability of communicative enabling
skills—correspond to the range and distribu-
tion of our objectives.

4. We want to know how well or how badly
the examinee can meet particular objectives.
Therefore we will ideally use criterion referencing
in our tests, so that the performance of each
examinee is compared with a definition of ade-
quate performance on the task rather than with
the performance of other examinees.

5. We want our tests to be reliable. This test
characteristic is particularly problematic with
respect to scoring criteria and procedures.
Communicative tests of global productive skills
(e.g., free composition, oral interview) will
probably require a scoring method which uses
global judgements by native speakers. High
levels of inter- and intra-rater reliability can be 
achieved in such cases, but generally only 
through the careful training of raters and long 
experience with a particular test format and 
scoring grid (i.e., at considerable expense).2? 
We must carefully experiment with and ana- 
lyze the results of our new testing approaches 
to improve reliability. 

6. We want our tests to be feasible. Therefore 
they obviously cannot have all of the above 
desirable characteristics all of the time. How- 
ever, by taking into consideration these char- 
acteristics we can at least strive to make our 
tests better measures of communicative competence.
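
The distinction drawn in the fourth principle, between comparing an examinee with a definition of adequate performance and comparing him or her with other examinees, can be illustrated with a minimal sketch. The scores, the cut-off of 70, and the function names are all invented for the illustration; none of them comes from the tests discussed in this article.

```python
from statistics import mean, pstdev

# Hypothetical task scores (0-100) for five examinees; invented for illustration.
scores = {"A": 82, "B": 58, "C": 71, "D": 90, "E": 64}

# Assumed definition of adequate performance on the task (not from the article).
CRITERION = 70


def criterion_referenced(score, criterion=CRITERION):
    """Compare one examinee with the definition of adequate performance."""
    return score >= criterion


def norm_referenced(score, group):
    """Compare one examinee with the other examinees (z-score in the group)."""
    return (score - mean(group)) / pstdev(group)


group = list(scores.values())
for name, score in scores.items():
    # First value: does the examinee meet the criterion?
    # Second value: where does the examinee stand relative to the group?
    print(name, criterion_referenced(score), round(norm_referenced(score, group), 2))
```

In this invented group, examinee C meets the criterion while falling below the group mean, which is the kind of case in which the two forms of referencing lead to different conclusions.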


EXAMPLES OF COMMUNICATIVE TESTS


This section will present three tests which
aim to evaluate second language communica-
tion skills. The first two, developed in Holland
and in the United States, are based on Van Ek's
specifications of threshold second-language
objectives for European school children and
adults.2° The third, developed in Britain,
exemplifies B.J. Carroll's system for test devel-
opment, and reflects Munby's work on the
specification of language needs.


1. CITO Functional Dialogue Language Tests

Language tests currently being developed for
Dutch secondary school students by CITO
(The National Institute for Educational Meas-
urement) illustrate the utilization of Council of
Europe second-language objectives in measures
of oral proficiency.” A brief description fol-
lows:

Format: A cross-indexed set of situational, the-
matic, and social skills “modules” in the form
of written guidelines for dialogues and ac-
companying illustrations.

Purpose: To test oral communicative ability in 
the performance of speech acts in the second 
language (French, German, English). Usable 
both as classroom exercises and for profi- 
ciency assessment. 

Clientele: Secondary school students with ap- 
proximately four years of study in the lan-
guage (three periods per week).

Specifications: Test objectives conform to overall 
objectives in the language courses concerned. 
Three types of target language behaviour
have been selected, based on situations, themes,
and stereotyped social speech acts. Selection
was based on a consideration of what stu-
dents might need to be able to do in the tar-
get language during travel abroad or in
encounters with foreign tourists in Holland.
Appropriate levels of difficulty were estab-
lished through pretesting in the schools.

A. Situations: Fifteen situations and subcompo-
nents were specified, along with the roles of
participants, the language functions to be
performed, and the specific notions needed.
Example
Situation: camping
Sub-component: reception desk
Roles: receptionist, guest
Language functions: asking for information,
persuading, etc.
Specific notions: site for a tent, equipment,
departure time, etc.
(Other situations include: “in a train compart-
ment,” “shopping,” “at the police station,” etc.)

B. Themes: Twelve themes, with sub-themes
and language functions, were specified.
Example
Theme: personal data
Sub-theme: name, address, age, etc.
Language functions: identifying, qualifying,
etc.
(Other themes include: “daily life,” “holidays,”
and various social and political problems.)

C. Social speech acts: Seven social speech acts
were specified. These include greeting, intro-
ducing oneself, thanking, taking one’s leave,
etc.

Procedures: Draft items were developed by 
CITO personnel, then extensively pretested 
in the schools and revised.

Test description: Each test consists of guidelines 
for a dialogue. One role is played by the 
examiner (teacher) and the other by the 
examinee. Each dialogue presents ten se- 
quential tasks. 

Example 

Le Camping 

(T) “Le soir, vous arrivez à la réception du
camping. Là il y a une vieille dame.
Saluez la dame.”

(E) (“Bonsoir Madame”)

(T) “La dame dit ‘bonsoir.’ Puis vous de-
mandez une place à la dame.”

(E) (“Je veux/voudrais camper ici,” “une
place pour ma tente,” etc.)

Scoring: Scoring is done by the examiner. An 
experimental grid was developed which pro- 
duced scores between 1 and 6 on each of the 
ten tasks on a given test, based on rater 
judgements of intelligibility, errors and pro- 
nunciation. Since inter-rater reliability was 
low in pretesting, a new scale is currently 
being developed.3? 
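
The scoring arrangement just described (a score between 1 and 6 on each of ten tasks) and the inter-rater reliability problem it ran into can be sketched as follows. This is only an illustration under stated assumptions: the two rating sheets are invented, the Pearson correlation and the exact-agreement rate are common indices chosen for the example rather than the statistics CITO reported, and the sketch does not model how a rater arrives at each task score from judgements of intelligibility, errors and pronunciation.

```python
from statistics import mean

# Invented ratings: each rater scores the ten dialogue tasks from 1 to 6.
rater_1 = [5, 4, 6, 3, 4, 5, 2, 4, 5, 3]
rater_2 = [4, 4, 5, 3, 5, 5, 3, 4, 4, 2]


def validate(ratings):
    """Check that a rating sheet has ten task scores in the range 1-6."""
    if len(ratings) != 10 or any(not 1 <= r <= 6 for r in ratings):
        raise ValueError("expected ten task scores between 1 and 6")
    return ratings


def pearson(x, y):
    """Pearson correlation between two raters' task scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def exact_agreement(x, y):
    """Proportion of tasks on which both raters gave the same score."""
    return sum(a == b for a, b in zip(x, y)) / len(x)


validate(rater_1)
validate(rater_2)
print("test totals:", sum(rater_1), sum(rater_2))
print("inter-rater correlation:", round(pearson(rater_1, rater_2), 2))
print("exact agreement:", exact_agreement(rater_1, rater_2))
```

Two raters can correlate fairly well while agreeing exactly on only a minority of tasks, which is one reason a scoring grid may need revision even when total scores look similar.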

Comments

These tests conform to many of the principles
of communicative testing outlined previously.
They are based on a theoretical model of com-
municative competence, and on specific learner
objectives deriving from it. They are interac-
tion-based, pragmatic tests (if properly ad-
ministered), with verbal and situational con-
text. The authenticity of the stimulus language
might be disputed, along with the way in which
language is used, since the examiner not only
plays a dialogue participant role but continually
instructs the learner on how to respond. Again,
authenticity of the task will vary with adminis- 
tration procedures, and the degree to which 
both participants are able to role-play. The tests 
are relatively direct in terms of their purpose 
(to evaluate oral interaction skills, and “doing” 
something in the language). Unpredictable re- 
sponses are allowed. Appropriateness of re- 
sponses is not taken into account in the original 
scoring system; however, it easily could be. The 
tests present varied situations and require a 
variety of language functions. They also pro- 
vide the possibility of criterion referencing. 
Reliability of scoring is as yet uncertain, as is
the concurrent and predictive validity of the 
tests; however, both construct and face validity 
would appear high. With respect to the cri- 
terion of feasibility, the development of these 
tests required a considerable investment of time 
and expertise. In terms of administration and 
wide applicability, they are, however, very 
practical. They also provide a model which can 
be used by teachers in developing their own 
tests.

2. Functional Test for English as a Second Language Students at UCLA

Farhady has recently developed a new ap-
proach to communicative test development,
based on the Van Ek threshold specifications
for adults.23

Format: A 64-item multiple choice written test
based on common university situations in-
volving foreign students. A situation involv-
ing several participants is described. Exam-
inees choose the one response of four which
is both grammatically correct and socially
appropriate.

Purpose: To test “functional competence” (de- 
fined by Farhady as consisting of linguistic 
and socio-cultural competence) in oral com-
munication. Scores are to be used in the
placement of incoming foreign students in 
ESL courses. 

Specifications for test items: Two language func- 
tions, each with four subfunctions relevant 




Britain, and represent authentic language tasks
as well as texts.

Format: Reading tests include authentic texts
from which examinees must extract various
kinds of useful information. Listening tests
provide tape-recorded texts. The writing
tests include tasks such as filling out per-
sonal information forms, addressing enve-
lopes, answering letters, and leaving brief
messages. The oral tests require the partici-
pation of two native speakers, one to par-
ticipate in a conversation with the examinee,
given a simulated situation, and the other to
evaluate the examinee's performance. Tasks
include such functions as asking for and
giving information, requesting help and
giving advice.

Purpose: To determine the degree to which for-
eign students in Britain have the requisite
English skills to “operate independently.”

Clientele: Individuals over age sixteen for whom
English is a second language, who wish to
study in Britain.

Specifications: The following are general con-
tent areas.33
“• Social interaction with native and non-
native speakers of English.
• Dealing with official and semi-official
bodies.
• Shopping and using services.
• Visiting places of interest and entertain-
ment.
• Travelling and arranging for travel.
• Using media for information and enter-
tainment.
• Medical attention and health.
• Studying for academic/occupational/social
purposes.”
Detailed specifications exist for the degree
of skill to be expected in each mode at each
level. The criteria used for these specifica-
tions in the case of reading tests follow.
“Questions set in the tests of reading skills
will take into account the following criteria
in determining the degree of skill expected
of the candidate.
i) Size of the text which the candidate can
handle.
ii) Complexity of the text which the candi-
date can handle.
iii) Range of language forms which the can-
didate can handle and comprehension
skills which he can use.
iv) Speed at which texts can be processed
and questions answered.
v) Flexibility in adopting suitable reading
strategies for the task set and adapting
to developments in the text.
vi) Independence from sources of reference.”
Examples for reading: “Search through text
to locate specific information.” “Study text
to decide upon an appropriate course of
action.”
Procedures: Tests were developed and pilot tested
in schools and colleges over a period of
several years. The preparation of new ver-
sions is planned on a continuing basis.
Examinations will be administered in a num-
ber of testing centres each spring.

Test description: Test tasks are quite varied.
Several items from specimen papers for read-
ing and writing are reproduced below.
Example 
Basic level reading
“You wrote to the Tourist Information
Office in Stratford last week, and they
have sent this letter, as well as the official
guide:
Dear Sir/Madam,
Thank you for your recent letter enquiring
about tourist facilities in the Stratford-
upon-Avon area, and asking us to make a
hotel reservation for you.”
(The letter contains five additional para-
graphs, on which a number of questions are
based.)
Sample question:
“What does the letter tell you?
A. They cannot reserve accommodation
for you.
B. They have reserved accommodation
for you.
C. You must pay 55p to reserve accom-
modation.
Put a cross (x) through the right answer
on your answer sheet.”

Intermediate level writing

“You have to go away tomorrow evening for
two days, but you unexpectedly receive this
telegram from your friend.
YES THANKS STOP ARRIV-
Write a short note to your friend to pin on
the door. Explain why you are not there
and where the keys are.”

Scoring: No published information is available 
on the scoring procedures used with these 
examinations. Clearly, many of the tasks 
would require global judgements by trained 
raters. The criteria used in scoring writing
and oral performance include accuracy in the
use of these forms, and the range of language
used by the examinee. Complexity is also a
criterion for writing, and flexibility and size
are criteria for oral performance (see pre-
ceding descriptions).

Comments

These tests conform well to the communica-
tive testing principles discussed in this paper.
They set interaction-based, pragmatic, integra-
tive language tasks. The language presented is
naturalistic, with both verbal and extralinguis-
tic context (as appropriate to the given task).
Both the texts presented and the tasks set
appear to be authentic, representing ways in
which people use language in everyday life.
The tests are thus high on the direct scale.
From the limited information available, scor-
ing procedures appear to take into account
grammatical, sociolinguistic, and strategic
[...] concurrent and predictive validity quotients
are, however, unknown, as is the reliability of
the tests (although information on these aspects
of the examinations is currently being gathered).
The tests represent a range of situations and
language functions; they conform to specific
objectives and thus would allow criterion ref-
erencing, although results are given in terms
of levels. The lengthy procedures involved in
[...]
for organizations other than those for whom test
development is a major business; nonetheless
the type of tasks and texts used in the test are
very suggestive for persons involved in lan-
guage testing in any capacity, including class-


WORK IN COMMUNICATIVE TESTING


The three preceding tests all represent large-
scale investments of expertise and [...]
fairly large clientele. [...]
their development is the [...]
of objectives based on detailed models of lan-
guage needs. Where a test is being prepared
for a restricted clientele with very specific
second-language needs, a more intuitive de-
scription of needs and objectives may be both
appropriate and more feasible. Hinofotis de-
scribes such a test, developed at UCLA to
[...] the English oral communicative ability
of foreign students employed as teaching as-
sistants. In this test, examinees are video-
taped as they carry on a simulated teaching task
in English. Trained raters subsequently view
the tapes, giving both global and detailed diag-
nostic scores for aspects of linguistic profi-
ciency, non-verbal behaviour and the effective-
ness with which information is communicated.
Acceptable reliability coefficients have been
achieved through the careful training of raters,
and the test appears to serve well as a measure
of whether the assistants have the requisite Eng-
lish skills for various kinds of teaching assign-
ments.
Second language tests for specific groups in
a bilingual university setting are currently
being prepared at the Centre for Second Lan-
guage [...]
for graduate students in history, which utilize
texts selected by history professors from cur-
rent academic journals. Multiple choice writ-
ten comprehension questions require exam-
inees to extract information at various levels of
detail, to infer meaning both from the selections
presented and from their knowledge of the
structure of historical argument, and to distin-
guish between such aspects of the message as
[...]
developed at the Centre is used to evaluate the
functional French skills of English-speaking
members of the Social Sciences Faculty. Test
content and tasks are based precisely on the
Faculty’s second-language requirements for
[...] their area [...] and [...] dents’ [...]
administrati[...] thus include oral [...]
French, listenin[...]
based on a tape-recorded excerpt from a
Faculty Council meeting (which examinees
may answer in English), and a reading-com-
prehension exercise based on a university
memorandum detailing how to fill out travel
expense claims. One sub-test, individualized
for each examinee, presents a one-page excerpt
from a scholarly text in his or her area of
specialization, to be summarized in English.
Candidates’ oral responses are tape-recorded.
The scoring criteria vary for the different sub-
tests, but, for example, in the oral interaction
test, equal weight is given by raters to under-
standing the question, getting across an appro-
priate answer, and the linguistic correctness of
the response. Performance is rated by a team
of experienced language teachers, but the final
(criterion-referenced) judgement about whether
examinee performance reflects the second-lan-
guage skills needed on the job rests with the
personnel committee and Dean of the Faculty.
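
The equal weighting of the three oral interaction criteria can be made concrete with a small sketch. The 0 to 4 marking scale, the class and function names, and the sample marks are assumptions introduced for illustration; the article specifies only that the three criteria carry equal weight and that a team of raters is involved. The sketch stops at an averaged score, since the final judgement rests with the personnel committee and Dean rather than with a computed cut-off.

```python
from dataclasses import dataclass


@dataclass
class OralRating:
    """One rater's marks for one oral response (the 0-4 scale is an assumption)."""
    understanding: int   # understanding the question
    message: int         # getting across an appropriate answer
    correctness: int     # linguistic correctness of the response

    def score(self):
        # Equal weight for the three criteria, as described in the text.
        return (self.understanding + self.message + self.correctness) / 3


def team_score(ratings):
    """Average the equally weighted scores across the rating team."""
    return sum(r.score() for r in ratings) / len(ratings)


# Invented marks from a team of three raters for a single response.
ratings = [OralRating(4, 3, 2), OralRating(3, 3, 3), OralRating(4, 4, 2)]
print(round(team_score(ratings), 2))
```

Under equal weighting, a response that is understood and gets its message across but is weak in linguistic correctness can still earn a respectable score, which reflects the communicative emphasis of the test.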


MAKING TESTS “MORE” COMMUNICATIVE. 


‘The preceding discussion and references pro- 
vide a number of suggestions about how to de- 
velop tests which to some extent tap features 
of communicative competence. These sugges- 
tions also have relevance where new tests will 
not be developed, but where items may be re- 
placed or changed. In situations where for prac- 
tical reasons one is “hooked into” traditional test 
formats (e.g., where rapid, mass testing is in- 
volved, communicative needs of examinees are 
heterogeneous or unknown, resources are not 
available to develop innovative tests, or instruc- 
tion is grammar-based), so that “ideal” com- 
munication tests will not be developed in the 
foreseeable future, it is at least possible to give
some importance to communicative aspects of 
language. This can be done by including in 
existing test formats some sub-tests or items 
clearly based on a communicative view of lan-
guage. [...]


NOTES


*This essay was selected by the editorial board of the
Canadian Modern Language Review (Anthony S. Mollica,
Editor) as the best article to appear in that publication in
1981. Under terms of an exchange agreement between
CMLR and The Modern Language Journal, it is reprinted here
in [...]


[...] longer (discourse-level), contextualized
texts in stimulus items or multiple-choice re-
sponses. (Reading passages, recorded dia-
logues, and cloze tests all lend themselves to
this possibility.) One can ask listening and 
reading comprehension questions based on a 
global understanding of the meaning conveyed 
through language use in context, rather than 
pinpointing discrete grammatical points." One
can ask questions about the appropriateness of 
contextualized language behaviours, or about 
situational aspects of an exchange (e.g., the 
roles, relationships, attitudes, and intended 
meanings of participants). Scoring of oral and 
written performance tasks can take into account 
getting the message across, different ways of 
expressing ideas, and appropriateness, as well
as grammatical aspects. Multiple choice re-
sponses and acceptable cloze answers can be
based on native speaker responses. 
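
The last suggestion, basing acceptable cloze answers on native speaker responses, can be sketched as a simple scoring procedure. The sample responses, the threshold `MIN_COUNT`, and the function names are hypothetical; the sketch assumes that any word supplied by at least a minimum number of native speakers for a given blank counts as acceptable, which is one possible reading of the suggestion and not a procedure described in this article.

```python
from collections import Counter

# Invented native-speaker responses for three cloze blanks.
native_responses = {
    1: ["went", "walked", "went", "drove", "went"],
    2: ["because", "since", "because", "as", "because"],
    3: ["quickly", "fast", "quickly", "quickly", "soon"],
}

MIN_COUNT = 1  # assumed threshold: any answer a native speaker gave is acceptable


def acceptable_answers(responses, min_count=MIN_COUNT):
    """Build the set of acceptable answers for each blank."""
    key = {}
    for blank, answers in responses.items():
        counts = Counter(a.strip().lower() for a in answers)
        key[blank] = {word for word, n in counts.items() if n >= min_count}
    return key


def score_cloze(examinee, key):
    """Count blanks where the examinee's answer is in the acceptable set."""
    return sum(
        examinee.get(blank, "").strip().lower() in words
        for blank, words in key.items()
    )


key = acceptable_answers(native_responses)
examinee = {1: "walked", 2: "Because", 3: "slowly"}
print(score_cloze(examinee, key), "of", len(key))
```

Raising the threshold narrows the key toward the most frequent native-speaker choice, so the same data can support either a lenient or a strict scoring policy.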


CONCLUSION 


An examination of the theoretical constructs, 
issues, and practical work done so far in com- 
municative testing would appear to have rele- 
vance for anyone involved in second-language 
testing, no matter what the situation. By mak-
ing our tests more reflective of the kinds of
situations, language content and purposes for
which second-language speakers will need their
skills, we will be able to make more accurate
predictions about how well they will be able to
function using the target language in “real life.”
Such testing is likely to have dramatic effects
on the format and content of second-language
curricula, as well, and to improve student moti-
vation through its increased relevance. The
foregoing discussion is an attempt to describe
the underlying concepts and current issues in
communicative testing, and to illustrate recent
application of these ideas for practitioners.*?